A continuous VQ clustering algorithm for realtime speech recognition

Authors

  • Xixian Chen
  • Changnian Cai
Abstract

This paper presents a continuous VQ clustering (CVQC) algorithm for realtime speech recognition that incorporates the temporal information of speech into both the training and recognition processes. Compared with the conventional DTW and VQ methods, the new algorithm trains and recognizes faster and uses a smaller codebook, while still retaining the merits of both. Realtime implementation is emphasized in the design of the algorithms. A custom available voice-controlled computer command input system based on CVQC is also introduced.

Similar resources

Clustering beyond phoneme contexts for speech recognition

Clustering using decision trees is generalized to take into account high-level knowledge sources to better model the co-articulation effects in large vocabulary continuous speech recognition. VQ models are used to reduce the computational cost of constructing decision trees. The search algorithm is designed such that it can provide a general type of information for decision trees without ...

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

Speaker change detection and speaker clustering using VQ distortion for broadcast news speech recognition

This paper addresses the problem of the detection of speaker changes and clustering speakers when no information is available regarding speaker classes or even the total number of classes. We assume that no previous information on speakers is available (no speaker model, no training phase) and that people do not speak simultaneously. The aim is to apply speaker grouping information to speaker a...

Evaluation of ETSI advanced DSR front-end and bias removal method on the Japanese newspaper article sentences speech corpus

In October 2002, the European Telecommunications Standards Institute (ETSI) recommended a standard Distributed Speech Recognition (DSR) advanced front-end, ETSI ES202 050 version 1.1.1 (ES202). Many studies use this front-end in noisy environments for several languages on connected digit recognition tasks. However, we have not seen reports of large vocabulary continuous speech recognition using ...

A New Vector Quantization Front-End Process for Discrete HMM Speech Recognition System

The paper presents a complete discrete statistical framework based on a novel vector quantization (VQ) front-end process. This new VQ approach performs an optimal distribution of VQ codebook components on HMM states. This technique, which we named the distributed vector quantization (DVQ) of hidden Markov models, succeeds in unifying acoustic micro-structure and phonetic macro-structure, when th...

Publication date: 1989